Nominal Coreference Annotation in IberEval2017: The Case of FORMAS Group
Abstract
This work describes the participation of the FORMAS group from the Federal University of Bahia (UFBA) in the Shared Task on Collective Elaboration of a Coreference Annotated Corpus for Portuguese Texts at IberEval 2017. It describes the creation of a corpus annotated with coreference information for the Portuguese language. We discuss the choices made in the annotation process, the results obtained, and their possible application to the development of methods and systems for processing Portuguese texts.
Similar resources
The Coding Scheme for Annotating Extended Nominal Coreference and Bridging Anaphora in the Prague Dependency Treebank
The present paper outlines an ongoing project of annotation of the extended nominal coreference and the bridging anaphora in the Prague Dependency Treebank. We describe the annotation scheme with respect to the linguistic classification of coreferential and bridging relations and focus also on details of the annotation process from the technical point of view. We present methods of helping the ...
Polish Coreference Corpus
The Polish Coreference Corpus (PCC) is a large corpus of Polish general nominal coreference built upon the National Corpus of Polish. With its 1900 documents from 14 text genres, containing about 540,000 tokens, 180,000 mentions and 128,000 coreference clusters, the PCC is among the largest coreference corpora in the international community. It has some novel features, such as the annotation of...
Interesting Linguistic Features in Coreference Annotation of an Inflectional Language
This paper reports on linguistic features and decisions that we find vital in the process of annotation and resolution of coreference for highly inflectional languages. The presented results have been collected during preparation of a corpus of general direct nominal coreference of Polish. Starting from the notion of a mention, its borders and potential vs. actual referentiality, we discuss the...
Multilingual corpora with coreferential annotation of person entities
This paper presents three corpora with coreferential annotation of person entities for Portuguese, Galician and Spanish. They contain coreference links between several types of pronouns (including elliptical, possessive, indefinite, demonstrative, relative and personal clitic and non-clitic pronouns) and nominal phrases (including proper nouns). Some statistics have been computed, showing distr...
Disagreement Dissected: Vagueness as a Source of Ambiguity in Nominal (Co-)Reference
Since the early investigations by Hirschman et al. (1997) and the critique of the MUC-7 annotation scheme put forward by van Deemter and Kibble (2000), several large corpora have been annotated with coreference relations, with refinements in terms of annotation schemes (Poesio, 2004), as well as in terms of support by the annotation tools. After van Deemter and Kibble and their critique of core...